Peter Meißner / 2016-02-29 – 2016-03-04 / ECPR WSMT
Find a course taster at:
http://pmeissner.com/downloads/user2015_meissner_webscraping.pdf
| phase | problems | examples |
|---|---|---|
| download | protocols | HTTP, HTTPS, POST, GET, … |
| download | procedures | cookies, authentication, forms, … |
| extraction | parsing | translating HTML (XML, JSON, …) into R |
| extraction | extraction | getting the relevant parts |
| extraction | cleansing | cleaning up, restructuring, combining |
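The extraction-side rows of the table can be sketched in a few lines of R. This is a minimal illustration, not the course's own code: it assumes the `xml2` package is installed, and the HTML snippet, party names, and seat counts are made up for the example. A hard-coded string stands in for the download phase so the sketch runs without network access; `read_html()` also accepts a URL in place of the string.

```r
library(xml2)

# Stand-in for the download phase: a hard-coded page (hypothetical content)
# instead of an actual HTTP request.
html <- '<html><body><table>
  <tr><th>name</th><th>seats</th></tr>
  <tr><td> Party A </td><td>1,200</td></tr>
  <tr><td>Party B</td><td> 850 </td></tr>
</table></body></html>'

# Parsing: translate the HTML text into an R object.
doc <- read_html(html)

# Extraction: get the relevant parts, here the table cells, via XPath.
names_raw <- xml_text(xml_find_all(doc, "//tr/td[1]"))
seats_raw <- xml_text(xml_find_all(doc, "//tr/td[2]"))

# Cleansing: trim whitespace, drop thousands separators, combine into one table.
result <- data.frame(
  name  = trimws(names_raw),
  seats = as.numeric(gsub(",", "", trimws(seats_raw))),
  stringsAsFactors = FALSE
)

print(result)
```

Keeping the raw vectors (`names_raw`, `seats_raw`) separate from the cleaned `result` makes it easy to check what the page actually delivered before any cleansing was applied.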
Bailer, Meißner, Ohmura, Selb (2013): Seiteneinsteiger im Deutschen Bundestag. Springer VS
##…
…